Abstract. The use of formal methods provides confidence in the correctness of
developments. Yet one may argue about the actual level of confidence obtained
when the method itself, or its implementation, is not formally checked. We
address this question for the B, a widely used formal method that allows for
the derivation of correct programs from specifications. Through a deep
embedding of the B logic in Coq, we check the B theory but also implement B
tools. Both aspects are illustrated by the description of a proved prover for
the B logic.

A clear benefit of formal methods is to increase the confidence in the
correctness of developments. However, one may argue about the actual level of
confidence obtained when the method or its implementation are not themselves
formally checked. This question is legitimate for safety, as one may accidentally
derive invalid results. It is even more relevant when security is a concern, as
any flaw can be deliberately exploited by a malicious developer to obfuscate
undesirable behaviours of a system while still getting a certification.

B [1] is a popular formal method that allows for the derivation of correct
programs from specifications. Several industrial implementations are available
(e.g. AtelierB, B Toolkit), and it is widely used in industry for projects where
safety or security is mandatory. So the B is a good candidate for addressing our
concern: when the prover says that a development is right, who says that the
prover is right? To answer this question, one has to check the theory as well as
the prover w.r.t. this theory (or, alternatively, to provide a proof checker). Those
are the objectives of BiCoq, a deep embedding of the B logic in Coq [2].

BiCoq benefits from the support of Coq to study the theory of B, and to check 
the validity of standard definitions and results. BiCoq also allows us, through an 
implementation strategy, to develop formally checked B tools. This strategy is 
illustrated in this paper by the development of a prover engine for the B logic, 
that can be extracted and used independently of Coq. Coq is therefore our notary 
public, witnessing the validity of the results associated to the B theory, as well 
as the correctness of tools implementing those results - ultimately increasing 
confidence in B developments. The approach, combining a deep embedding and
an implementation technique, can be extended to address further elements of
the B, beyond its logic, or to safely enrich it, as illustrated in this paper.

This paper is divided into 9 sections. Sections 1, 2 and 3 briefly introduce
B, Coq and the notion of embedding. The B logic and its formalisation in Coq
are presented in Sec. 4. Section 5 describes various results proved using BiCoq.
Section 6 focuses on the implementation strategy, and presents its application
to the development of a set of extractible proof tactics for a B prover. Section 7
discusses further uses of BiCoq, and mentions some existing extensions. Finally,
Sec. 8 concludes and identifies further activities.



1 A Short Introduction to B 

In a nutshell, the B method defines a first-order predicate logic completed with
elements of set theory, a Generalised Substitution Language (GSL) and a
methodology of development. An abstract B machine is a module combining a state,
properties and operations (described as substitutions) to read or alter the state.

The logic is used to express preconditions, invariants, etc. and to conduct 
proofs. The GSL allows for definitions of substitutions that can be abstract, 
declarative and non-deterministic (that is, specifications) as well as concrete, 
imperative and deterministic (that is, programs). The following example uses 
the non-deterministic substitution ANY (a "magic" operator finding a value 
which satisfies a property) to specify the square root of a natural number n: 

Example 1. ANY x WHERE x*x ≤ n < (x+1)*(x+1) THEN √(n) := x END
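
The ANY substitution of Example 1 states a property and says nothing about how a witness is found. As a rough illustration only, the Python sketch below separates the two aspects; the names `satisfies_spec` and `isqrt_search` are hypothetical and belong neither to B nor to BiCoq, and the linear search is just one of many possible deterministic refinements.

```python
def satisfies_spec(x: int, n: int) -> bool:
    """The WHERE clause of Example 1: x*x <= n < (x+1)*(x+1)."""
    return x * x <= n < (x + 1) * (x + 1)


def isqrt_search(n: int) -> int:
    """One possible deterministic refinement: search upwards for the witness."""
    x = 0
    while not satisfies_spec(x, n):
        x += 1
    return x
```

Any function whose result always passes `satisfies_spec` would be an acceptable implementation of the specification; the search above is chosen for readability, not efficiency.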

Regarding the methodology, a machine Mc refines an abstract machine Ma
if one cannot distinguish Mc from Ma by valid operation calls - this notion
being independent of the internal representations, as illustrated by the following
example of a system returning the maximum of a set of stored values:

Example 2. The state of Ma is a (non implementable) set of natural numbers; the
state of Mc is a natural number. Yet Mc, having the expected behaviour, refines Ma.

MACHINE Ma                                REFINEMENT Mc
VARIABLES S                               VARIABLES s
INVARIANT S ⊆ ℕ                           INVARIANT s = max(S ∪ {0})
INITIALISATION S := ∅                     INITIALISATION s := 0
OPERATIONS                                OPERATIONS
  store(n) ≙                                store(n) ≙
    PRE n ∈ ℕ THEN S := S ∪ {n} END           IF s < n THEN s := n END
  m ← get ≙                                 m ← get ≙
    PRE S ≠ ∅ THEN m := max(S) END            BEGIN m := s END
END                                       END
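
The refinement claim of Example 2 can be explored informally by replaying the same valid calls on both machines. The Python sketch below is such a simulation, under the assumption that preconditions are modelled by assertions; the class and function names are hypothetical, and random testing of this kind is of course no substitute for discharging the B proof obligations.

```python
class MachineA:
    """Abstract machine of Example 2: the state is a set of naturals."""

    def __init__(self):
        self.S = set()               # INITIALISATION S := {}

    def store(self, n):
        assert n >= 0                # PRE n is a natural number
        self.S = self.S | {n}

    def get(self):
        assert self.S                # PRE S is not empty
        return max(self.S)


class MachineC:
    """Refinement of Example 2: the state is a single natural."""

    def __init__(self):
        self.s = 0                   # INITIALISATION s := 0

    def store(self, n):
        if self.s < n:
            self.s = n

    def get(self):
        return self.s


def indistinguishable(calls):
    """Replay store/get on both machines; check outputs and the linking invariant."""
    a, c = MachineA(), MachineC()
    for n in calls:
        a.store(n)
        c.store(n)
        if c.s != max(a.S | {0}) or a.get() != c.get():
            return False
    return True
```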

Refinement being transitive, it is possible to go progressively from the
specification to the implementation. By discharging at each step the proof
obligations defined by the B methodology, a program can be proved to be a correct
and complete implementation of a specification. This methodology, combined with
the numerous native notions provided by the set theory and the existence of
toolkits, makes the B a popular formal method, widely used in industry.

Note that the B logic is not genuinely typed and allows for manipulation of
free variables. A special mechanism, called type-checking (but thereafter referred
to as wf-checking), filters ill-formed (potentially paradoxical) terms; it is only
mentioned in this paper, deserving a dedicated analysis.

The rest of the paper only deals with the B logic (its inference rules). 

2 A Short Introduction to Coq 

Coq is a proof assistant based on a type theory. It offers a higher-order logical
framework that allows for the construction and verification of proofs, as well 
as the development and analysis of functional programs in an ML-like language 
with pattern-matching. It is possible in Coq to define values and types, including 
dependent types (that is, types that explicitly depend on values); types of sort 
Set represent sets of computational values, while types of sort Prop represent 
logical propositions. When defining an inductive type (that is, a least fixpoint), 
associated structural induction principles are automatically generated. 

For the intent of this paper, it is sufficient to see Coq as allowing for the 
manipulation of inductive sets of terms. For example, let's consider the standard 
representation of natural numbers: 

Example 3. $\mathtt{Inductive}\ \mathbb{N} : \mathtt{Set} := 0 : \mathbb{N} \mid S : \mathbb{N} \to \mathbb{N}$

It defines a type $\mathbb{N}$ which is the smallest set of terms stable by application of the
constructors $0$ and $S$. $\mathbb{N}$ is exactly made of the terms $0$ and $S^n(0)$ for any finite
$n$; being well-founded, structural induction on $\mathbb{N}$ is possible.

Coq also allows for the declaration of inductive logical properties, e.g.: 

Example 4. $\mathtt{Inductive}\ ev : \mathbb{N} \to \mathtt{Prop} := ev_0 : ev\ 0 \mid ev_2 : \forall (n : \mathbb{N}),\ ev\ n \to ev\ (S\ (S\ n))$

It defines a family of logical types: $ev\ 0$ is a type inhabited by the term $(ev_0)$,
$ev\ 2$ is another type inhabited by $(ev_2\ 0\ ev_0)$, and $ev\ 1$ is an empty type. The
standard interpretation is that $ev_0$ is a proof of the proposition $ev\ 0$ and that
there is no proof of $ev\ 1$, that is we have $\lnot(ev\ 1)$.

An intuitive interpretation of our two examples is that N is a set of terms, 
and ev a predicate marking some of them, defining a subset of N. 
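
To make the "types inhabited by proof terms" reading more concrete, here is a small Python sketch of Example 4. It is only an analogy: naturals are plain Python integers rather than Peano terms, derivations are nested tuples, and the function names `conclusion` and `inhabited` are hypothetical.

```python
def conclusion(proof):
    """Return n such that `proof` is a well-formed derivation of `ev n`."""
    if proof == ("ev0",):
        return 0
    tag, n, sub = proof              # raises ValueError on a malformed tuple
    if tag != "ev2" or conclusion(sub) != n:
        raise ValueError("ill-formed derivation")
    return n + 2


def inhabited(n):
    """Search for a derivation of `ev n`; None stands for an empty type."""
    if n == 0:
        return ("ev0",)
    if n == 1:
        return None
    sub = inhabited(n - 2)
    return None if sub is None else ("ev2", n - 2, sub)
```

Here `("ev2", 0, ("ev0",))` plays the role of the term inhabiting `ev 2`, while `inhabited(1)` returning `None` mirrors the emptiness of `ev 1`.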

3 Deep Embedding and Related Works 

Embedding in a proof assistant consists in mechanizing a guest logic by encoding
its syntax and semantics into a host logic ([3,4,5]). In a shallow embedding,
the encoding is partially based on a direct translation of the guest logic into
constructs of the host logic. In a deep embedding the syntax and the semantics
are formalised as datatypes. At a fundamental level, taking the view presented
in Sec. 2, the deep embedding of a logic is simply a definition of the set of
all sequents (the terms) and a predicate marking those that are provable (the
inference rules of the guest logic being encoded as constructors of this predicate).

Shallow embeddings of B in higher-order logics have been proposed in several
papers (cf. [6,7]) formalising the GSL in PVS, Coq or Isabelle/HOL. Such
embeddings are not dealing with the B logic, and by using directly the host
logic to express B notions, they introduce a form of interpretation. If the
objective is to have an accurate formalisation of the guest system, the definition
of a valid interpretation is difficult - e.g. B functions are relations, possibly
partial or undecidable, and translating accurately this concept in Coq is a
tricky exercise.

BiCoq aims at such an accurate formalisation, to pinpoint any problem of
the theory with the objective to increase confidence in the developments when
safety or security is a concern; in addition, we also have an implementation
objective. In such a context, a deep embedding is fully justified; see for example
the development of a sound and complete theorem prover for first-order logic
verified in Isabelle proposed in [8].

A deep embedding of the B logic in Coq is described in [9] (using notations
with names), to validate the base rules used by the prover of Atelier-B - yet
not checking standard B results, and without implementation goal. As far as
the implementation of a trusted B prover is concerned, we can also mention the
encoding of the B logic as a rewriting system proposed in [10].

Deep embeddings also have the advantage of clearly separating the host and the
guest logics: in BiCoq, excluded middle, provable in B, is not promoted to Coq.
This improves readability, and allows one to study meta-theoretical questions
such as consistency. Furthermore, the host logic consistency is not endangered.

4 Formalising the B Logic in Coq 

In this section, we present our embedding of the B logic in the Coq system;
the embedding uses a De Bruijn representation that avoids ambiguities and
constitutes an efficient solution w.r.t. the implementation objective (see [11,12]).
Deviations between B and its formalisation are described and justified.

Notation. B definitions use upper case letters with standard notations. BiCoq uses
lower case letters, and mixes B and Coq notations; standard notations are used for Coq
(e.g. $\forall$ is the universal quantification) while dotted notations are used for the embedded
B (e.g. $\dot{\forall}$ is the universal quantification constructor).

Notation. $[T]$ denotes the type of the lists whose elements have type $T$.

4.1 Syntax

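Given a set of identifiers ($I$), the B logic syntax defines predicates ($P$),
expressions ($E$), sets ($S$) and variables ($V$) as follows:

$P := P \land P \mid P \Rightarrow P \mid \lnot P \mid \forall V \cdot P \mid E = E \mid E \in E \mid [V := E]P$

$E := V \mid S \mid E \mapsto E \mid \downarrow S \mid [V := E]E$

$S := \mathrm{BIG} \mid \uparrow S \mid S \times S \mid \{V \,|\, P\}$

$V := I \mid V, V$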



In this syntax, $[V := E]T$ represents the (elementary) substitution, $V_1, V_2$ a list of
variables, $E_1 \mapsto E_2$ a pair of expressions, $\downarrow$ and $\uparrow$ the choice and power set operators,
and BIG a constant set. The comprehension set operator, while syntactically
defined by $\{V \,|\, P\}$, is rejected at wf-checking if not of the form $\{V \,|\, V \in S \land P\}$,
with $V$ a variable not free in $S$.

Definition. Other connectors are defined from the previous ones: $P \Leftrightarrow Q$ is defined as
$P \Rightarrow Q \land Q \Rightarrow P$, $P \lor Q$ as $\lnot P \Rightarrow Q$, and $\exists V \cdot P$ as $\lnot \forall V \cdot \lnot P$.

The first design choice of BiCoq is to use a pure nameless De Bruijn notation
(see [11,13]), where variables are represented by indexes giving the position of
their binder - here the universal quantifier and the comprehension set. When
an index exceeds the number of parent binders, it is said to be dangling and
represents a free variable, whose name is provided by a scope (left implicit in
this paper), so that any syntactically correct term is semantically valid, and there
is no need for well-formedness conditions⁴. In this representation, proofs of side
conditions related to name clashing are replaced by computations on indexes,
but the index representing a variable is not constant in a term.

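The B syntax is formalised in Coq by two mutually inductive types with
the following constructors, $\mathbb{I}$ being the set of indexes (that is, $\mathbb{N} \setminus \{0\}$) and $\mathbb{J}$ an
infinite set of names with a decidable equality:

$\mathbb{P} := \mathbb{P} \dot{\land} \mathbb{P} \mid \mathbb{P} \dot{\Rightarrow} \mathbb{P} \mid \dot{\lnot}\mathbb{P} \mid \dot{\forall}\mathbb{P} \mid \mathbb{E} \dot{=} \mathbb{E} \mid \mathbb{E} \dot{\in} \mathbb{E}$

$\mathbb{E} := \chi_{\mathbb{I}} \mid \mathbb{E} \mapsto \mathbb{E} \mid \downarrow\mathbb{E} \mid \Omega \mid \uparrow\mathbb{E} \mid \mathbb{E} \times \mathbb{E} \mid \{\mathbb{E} \,|\, \mathbb{P}\} \mid \omega_{\mathbb{J}}$

$\mathbb{P}$ represents B predicates, while $\mathbb{E}$ merges B expressions, sets and variables.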

Using a De Bruijn representation, binders $\dot{\forall}$ and $\{\,|\,\}$ have no attached names
and only bind (implicitly) a single variable. Binding over list of variables can be
eliminated without loss of expressivity, as illustrated by the following example:

Example 5. $\{V \,|\, V \in S_1 \times S_2 \land \exists V_1 \cdot (V_1 \in S_1 \land \exists V_2 \cdot (V_2 \in S_2 \land V_1 \mapsto V_2 = V \land P))\}$ represents
$\{V_1, V_2 \,|\, V_1, V_2 \in S_1 \times S_2 \land P\}$⁵

The constructor $\{\,|\,\}$ is further modified to be parameterised by an expression,
to keep in the syntax definition only wf-checkable terms. Indeed, only
comprehension sets of the form $\{V \,|\, V \in E \land P\}$, with $V$ not free in $E$, are valid. The
BiCoq representation of this set is $\{e \,|\, p\}$; to reflect the non-freeness condition,
$\{e \,|\, p\}$ only binds variables in its predicate parameter $p$. By these design choices,
we bridge the gap between syntactically correct terms and wf-checkable ones,
while being conservative.

$\Omega$ represents the constant set BIG, $\chi$ unary (De Bruijn) variables. The
constructor $\omega$ is without B equivalent, and provides elements of $\Omega$ (cf. Par. 4.3).

Notation. $\chi_i$ denotes the application of constructor $\chi$ to $i : \mathbb{I}$ and $\omega_j$ of constructor
$\omega$ to $j : \mathbb{J}$. By abuse of notation the variable $\chi_i$ is also denoted simply by $i$.

4 An alternative approach to avoid well-formedness conditions is described in [14].

5 This second representation, while standard in B, appears to be an illegal binding
over the expression $x \mapsto y$ rather than over the variable $x, y$, but the same notations
are used for both in [1] and such confusions are frequent.



Finally, the elementary substitution is not considered in BiCoq as a
syntactical construct but is replaced by functions on terms - substitution being
introduced earlier in B only to be used in the description of inference rules.
Note however that the full GSL of B can still be formalised by additional term
constructors (the explicit substitution approach, see [15,16]).

Notation. $p_1 \dot{\Leftrightarrow} p_2$ is defined as $p_1 \dot{\Rightarrow} p_2 \dot{\land} p_2 \dot{\Rightarrow} p_1$, $p_1 \dot{\lor} p_2$ as $\dot{\lnot} p_1 \dot{\Rightarrow} p_2$, and $\dot{\exists} p$ as $\dot{\lnot}\dot{\forall}\dot{\lnot} p$.

Notation. $\mathbb{T}$ denotes the type of terms, that is the union of $\mathbb{P}$ and $\mathbb{E}$.

4.2 Dealing with the De Bruijn Notation 

De Bruijn notation is an elegant solution to avoid complex name management, 
and it has numerous merits. But it also has a big drawback, being an unusual 
representation for human readers: 

Example 6. If $x \in y$ is the interpretation of the term $1 \dot{\in} 2$, the interpretation of the
term $\dot{\forall}(1 \dot{\in} 2)$ is $\forall t \cdot t \in x$; because of the binder, the scope has shifted (so 2 now represents
$x$), and (likely) the semantics has been distorted.

In this paragraph, we illustrate some of the consequences of using a De Bruijn 
notation, as well as how to mask such consequences from the users. 

Induction. When defining type $\mathbb{T}$, Coq automatically generates the associated
structural induction principle. As illustrated in Ex. 6, it is however not
semantically adequate, because it does not reflect De Bruijn indexes scoping. A more
interesting principle is derived in BiCoq by using the syntactical depth function
$\mathcal{D}$ of a term as a well-founded measure:

$\forall (P : \mathbb{T} \to \mathtt{Prop}),\ (\forall (t : \mathbb{T}),\ (\forall (t' : \mathbb{T}),\ \mathcal{D}(t') < \mathcal{D}(t) \to P\ t') \to P\ t) \to \forall (t : \mathbb{T}),\ P\ t$

With this principle, for the term $\dot{\forall}(1 \dot{\in} 3)$ (that is, $\forall t \cdot t \in y$) we can choose to use
an induction hypothesis on $1 \dot{\in} 2$ (that is, $x \in y$) instead of $1 \dot{\in} 3$ (that is, $x \in z$).

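Non-Freeness. The B notation $V \backslash T$ means that the variable $V$ does not appear
free in $T$. Non-freeness is defined in BiCoq as a type $\backslash : \mathbb{I} \to \mathbb{T} \to \mathtt{Prop}$ (a relation
between $\mathbb{I}$, representing the variables, and $\mathbb{T}$), with the following rules⁶:

$\frac{}{i \backslash \Omega} \qquad \frac{}{i \backslash \omega_j} \qquad \frac{i_1 \neq i_2}{i_1 \backslash \chi_{i_2}} \qquad \frac{(i+1) \backslash p}{i \backslash \dot{\forall} p} \qquad \frac{i \backslash e \quad (i+1) \backslash p}{i \backslash \{e \,|\, p\}}$

The first two rules are axioms, since the associated constructors are atomic and do
not interact with variables. The rules for $\dot{\forall}$ and $\{\,|\,\}$ reflect the fact that the
associated constructors are binders and therefore shift the scope.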



6 The rules for the other constructors are trivial and can be obtained by
straightforward extension, e.g. here $i \backslash p$ and $i \backslash q$ allow to derive $i \backslash p \dot{\Rightarrow} q$.
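
As an informal companion to the non-freeness rules, the following Python sketch computes the same relation as a boolean function over a tuple encoding of De Bruijn terms. The encoding (`("var", i)`, `("big",)`, `("omega", j)`, `("forall", p)`, `("compr", e, p)`, and other constructors as `(tag, child, ...)`) and the name `not_free` are assumptions made for this illustration; they are not BiCoq's actual definitions.

```python
def not_free(i, t):
    """Return True when De Bruijn variable i does not occur free in term t."""
    tag = t[0]
    if tag in ("big", "omega"):          # atomic constructors: axioms
        return True
    if tag == "var":                     # i1 differs from i2
        return t[1] != i
    if tag == "forall":                  # binder: the scope shifts by one
        return not_free(i + 1, t[1])
    if tag == "compr":                   # {e|p} binds only in p
        return not_free(i, t[1]) and not_free(i + 1, t[2])
    # remaining constructors: straightforward extension to all children
    return all(not_free(i, sub) for sub in t[1:])
```

For instance, with a scope where 1 stands for x and 2 for y, the term `("forall", ("in", ("var", 1), ("var", 3)))` reads as "for all t, t ∈ y": x is not free in it, whereas y is.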



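Binding, Instantiation and Substitution. It is possible to define functions to
simulate B binding (that is the use of $\forall$ or $\{\}$, representing $\lambda$-abstraction). These
functions constitute a built-in user interface to produce De Bruijn terms while
using the usual representation, making De Bruijn indexes and their management
invisible to the user (see also [17] for a similar approach):

Usual rep.:       $\forall V_1 \cdot V_1 \in \{V_2 \,|\, V_2 \in E \land V_1 = V_2\}$
Functional rep.:  $\Uparrow_{\forall}(i_1 \cdot i_1 \dot{\in} \Uparrow_{\{\}}(i_2 : e \cdot i_1 \dot{=} i_2))$
Internal rep.:    $\dot{\forall}(1 \dot{\in} \{e \,|\, 2 \dot{=} 1\})$

The internal representation is obtained from the functional one by computation,
and the functional one is recovered from the internal one by pretty-printing.

The binding functions are defined by:

$\Uparrow_{\forall}(i \cdot p) := \dot{\forall}\, \mathrm{Bind}\ i\ 1\ p \qquad \Uparrow_{\{\}}(i : e \cdot p) := \{e \,|\, \mathrm{Bind}\ i\ 1\ p\} \qquad \Uparrow_{\exists}(i \cdot p) := \dot{\exists}\, \mathrm{Bind}\ i\ 1\ p$

$\mathrm{Bind}\ (i_1\ i_2 : \mathbb{I})(t : \mathbb{T}) : \mathbb{T} :=$ match $t$ with
  $\Omega \mid \omega_j \Rightarrow t$
  $\chi_{i'} \Rightarrow t$ if $i' < i_2$, or else $\chi_{i_2}$ if $i' = i_1$, or else $\chi_{i'+1}$
  $\dot{\forall} p' \Rightarrow \dot{\forall}(\mathrm{Bind}\ (i_1+1)\ (i_2+1)\ p')$
  $\{e' \,|\, p'\} \Rightarrow \{\mathrm{Bind}\ i_1\ i_2\ e' \,|\, \mathrm{Bind}\ (i_1+1)\ (i_2+1)\ p'\}$
  $\ldots \Rightarrow \ldots$ (straightforward extension)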



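On the same principles, the definition of instantiation functions (for elimination
of $\dot{\forall}$ or $\{\}$, representing $\beta$-reduction and denoted by $\Downarrow_{\forall}(p \leftarrow e) : \mathbb{P} \to \mathbb{E} \to \mathbb{P}$ and
$\Downarrow_{\{\}}(e_1 \leftarrow e_2) : \mathbb{E} \to \mathbb{E} \to \mathbb{P}$) is straightforward; being partial, these functions just
require in Coq an additional proof parameter (omitted in this paper) that the
term is of the expected form. Finally, it is also possible to define a substitution
function⁷ $(i := e)t : \mathbb{I} \to \mathbb{E} \to \mathbb{T} \to \mathbb{T} :=$ match $t$ with
  $\Omega \mid \omega_j \Rightarrow t$
  $\chi_{i'} \Rightarrow$ if $i' = i$ then $e$ else $t$
  $\dot{\forall} p' \Rightarrow \dot{\forall}(i+1 := \mathrm{Lift}(e))p'$
  $\{e' \,|\, p'\} \Rightarrow \{(i := e)e' \,|\, (i+1 := \mathrm{Lift}(e))p'\}$
  $\ldots \Rightarrow \ldots$ (straightforward extension)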

where Lift, not detailed in this paper, increments dangling De Bruijn indexes. 
Remember that substitution is introduced early in B as a syntactical construct, 
but only to be used in inference rules. We consider that such rules are better 
represented using the resulting term (that is, the reduction of the application of 
the substitution). 
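
The binding, lifting, substitution and instantiation functions described above can be prototyped in a few lines. The Python sketch below is one possible reading of these definitions over a hypothetical tuple encoding of terms (`("var", i)`, `("big",)`, `("omega", j)`, `("forall", p)`, `("compr", e, p)`, other constructors as `(tag, child, ...)`); the helper `map_vars` and all function names are assumptions introduced for the example, and the proof parameters required by Coq for partial functions are ignored.

```python
# Position (in the tuple) of the children that sit under a binder.
BINDERS = {"forall": (1,), "compr": (2,)}


def map_vars(f, t, depth=0):
    """Rebuild t, replacing each ("var", i) by f(i, depth); depth counts binders."""
    tag = t[0]
    if tag == "var":
        return f(t[1], depth)
    if tag in ("big", "omega"):
        return t
    bound = BINDERS.get(tag, ())
    return (tag,) + tuple(
        map_vars(f, sub, depth + 1 if k in bound else depth)
        for k, sub in enumerate(t[1:], start=1))


def lift(t, by=1):
    """Increment the dangling indexes of t by `by`."""
    return map_vars(lambda i, d: ("var", i + by if i > d else i), t)


def bind(i, t):
    """Bind i 1 t: free variable i becomes the variable of a new outer binder."""
    def f(v, d):
        if v <= d:                       # bound by an inner binder: unchanged
            return ("var", v)
        if v == i + d:                   # the variable being bound
            return ("var", d + 1)
        return ("var", v + 1)            # other free variables are shifted
    return map_vars(f, t)


def subst(i, e, t):
    """(i := e)t, lifting e each time a binder is crossed."""
    return map_vars(lambda v, d: lift(e, d) if v == i + d else ("var", v), t)


def instantiate(body, e):
    """Remove the outer binder of `body`, replacing its variable by e."""
    def f(v, d):
        if v <= d:
            return ("var", v)
        if v == d + 1:
            return lift(e, d)
        return ("var", v - 1)
    return map_vars(f, body)
```

With `p = ("in", ("var", 1), ("var", 2))` (that is, x ∈ y), `("forall", bind(1, p))` is the internal form of "for all x, x ∈ y", and instantiating the bound body with an expression gives the same term as substituting that expression for x in `p`.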

Once these functions are defined, numerous lemmas are proved, such as the
(in)famous ones describing all possible interactions between lifting, binding,
instantiation and substitution. The following results are then derived, proving
the irrelevance of $\alpha$-renaming or describing relationships between instantiation,
binding and substitution (with $=$ the Coq term structural equality):

$i_2 \backslash p \to \Uparrow_{\forall}(i_1 \cdot p) = \Uparrow_{\forall}(i_2 \cdot (i_1 := i_2)p) \qquad i_2 \backslash p \to \Uparrow_{\{\}}(i_1 : e \cdot p) = \Uparrow_{\{\}}(i_2 : e \cdot (i_1 := i_2)p)$

$\Downarrow_{\forall}(\Uparrow_{\forall}(i \cdot p) \leftarrow i) = p \qquad \Downarrow_{\{\}}(\Uparrow_{\{\}}(i : e \cdot p) \leftarrow i) = i \dot{\in} e \dot{\land} p$

$\Downarrow_{\forall}(\Uparrow_{\forall}(i \cdot p) \leftarrow e) = (i := e)p$

7 Substitution and instantiation may seem similar in usual notation, but their
differences are emphasised when using De Bruijn notation.

4.3 Inference Rules 

Having formalised the B syntax and defined some functions and properties on
terms, the next step is to encode the B inference rules. Thanks to the use of the
functional representation described in the previous paragraph, BiCoq rules look
very much like the standard B rules. The translation is therefore straightforward,
merely a syntactical one, and the risk of error is very limited.

In our formalisation sets of hypotheses are represented by lists, with
membership ($\in$) and inclusion ($\subseteq$) as well as the pointwise extension of non-freeness
($\backslash$). The B inference rules are formalised as constructors of an inductive type
$\vdash : [\mathbb{P}] \to \mathbb{P} \to \mathtt{Prop}$, that is $g \vdash p$ is the Coq type of all B proofs of $p$ under the
assumptions $g$. Such a type may be inhabited (i.e. $p$ is provable assuming $g$) or
empty (i.e. there is no proof of $p$ under the assumptions of $g$).

The B rules and their encoding as constructors are detailed in Tab. 1, universal
quantifications being omitted (the types are $g, g_1, g_2 : [\mathbb{P}]$; $p, p_1, p_2 : \mathbb{P}$;
$e, e_1, e_2, e_3, e_4 : \mathbb{E}$; $i, i_1, i_2 : \mathbb{I}$ and $j, j_1, j_2 : \mathbb{J}$). For most of them, translation
is straightforward, only taking care to use functional substitution and binding
where appropriate. On the other hand, the use of the functional representation
imposes to keep the syntactical side conditions, except for the comprehension
set rule, where such condition is embedded in the syntax; new rules have to be
derived to benefit from the internal De Bruijn representation.

Only the last two B inference rules deserve discussion. The first one of these
indicates that the constant set BIG is infinite, using the infinite B predicate
defined by a fixpoint; unfolding this definition to produce a translation is
possible, but not practical. Therefore, this rule is replaced in BiCoq by two different
rules allowing to exhibit an infinity of elements of BIG, $\mathbb{J}$ being itself infinite.

The last rule, defining the semantics of pairs and products, is more
interesting. A straightforward translation of this rule indeed leads to the impossibility
to prove, in BiCoq, the following theorems from [1]:

$\vdash (E \mapsto F) = (E' \mapsto F') \Rightarrow E = E' \land F = F'$

$\vdash S \in \uparrow U \land T \in \uparrow V \Rightarrow (S \times T) \in \uparrow (U \times V)$

The proof of the first result provided in [1] is flawed, due to a confusion between
pairs of expressions and lists of variables (as pointed out in [H]), both using the
same notation - and cannot be corrected in the absence of a form of destructor
for pairs. On the other hand, the proof of the monotonicity of cartesian product
w.r.t. inclusion is not detailed in [1], being considered trivial. However, using the
listed rules, one may derive predicates of the form $V \in S \times T$ but without being
able to constrain $V$ to be a pair to apply the last rule (a classical problem of
the untyped $\lambda$-calculus). Basically, injectivity and surjectivity rules are lacking;



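these observations, probably well known to the B gurus but not documented to
our knowledge, have led us to replace this B rule by three new rules in order
to be able to prove the expected theorems. Again, this process illustrates our
conservative approach.

Table 1. Encoding of the B inference rules

| B inference rules | BiCoq formalisation |
|---|---|
| $P \vdash P$ | None, derived from $[\in]$ |
| $\frac{P \text{ appears in } \Gamma}{\Gamma \vdash P}$ | $p \in g \to g \vdash p \quad [\in]$ |
| $\frac{\Gamma' \text{ includes } \Gamma \quad \Gamma \vdash P}{\Gamma' \vdash P}$ | $g_1 \vdash p \to g_1 \subseteq g_2 \to g_2 \vdash p \quad [\subseteq]$ |
| $\frac{\Gamma \vdash P \quad \Gamma, P \vdash Q}{\Gamma \vdash Q}$ | None, derived from the two $\Rightarrow$ rules, $[\subseteq]$ and $[\in]$ |
| $\frac{\Gamma \vdash P \Rightarrow Q}{\Gamma, P \vdash Q}$ | $g \vdash p_1 \dot{\Rightarrow} p_2 \to g, p_1 \vdash p_2$ |
| $\frac{\Gamma, P \vdash Q}{\Gamma \vdash P \Rightarrow Q}$ | $g, p_1 \vdash p_2 \to g \vdash p_1 \dot{\Rightarrow} p_2$ |
| $\frac{\Gamma \vdash P \quad \Gamma \vdash Q}{\Gamma \vdash P \land Q}$ | $g \vdash p_1 \to g \vdash p_2 \to g \vdash p_1 \dot{\land} p_2 \quad [\land_i]$ |
| $\frac{\Gamma \vdash P \land Q}{\Gamma \vdash P}$ | $g \vdash p_1 \dot{\land} p_2 \to g \vdash p_1$ |
| $\frac{\Gamma \vdash P \land Q}{\Gamma \vdash Q}$ | $g \vdash p_1 \dot{\land} p_2 \to g \vdash p_2$ |
| $\frac{\Gamma, Q \vdash P \quad \Gamma, Q \vdash \lnot P}{\Gamma \vdash \lnot Q}$ | $g, p_2 \vdash p_1 \to g, p_2 \vdash \dot{\lnot} p_1 \to g \vdash \dot{\lnot} p_2$ |
| $\frac{\Gamma, \lnot Q \vdash P \quad \Gamma, \lnot Q \vdash \lnot P}{\Gamma \vdash Q}$ | $g, \dot{\lnot} p_2 \vdash p_1 \to g, \dot{\lnot} p_2 \vdash \dot{\lnot} p_1 \to g \vdash p_2$ |
| $\Gamma \vdash E = E$ | $g \vdash e \dot{=} e$ |
| $\frac{\Gamma \vdash P \quad V \backslash \Gamma}{\Gamma \vdash \forall V \cdot P}$ | $i \backslash g \to g \vdash p \to g \vdash \Uparrow_{\forall}(i \cdot p) \quad [\forall_i]$ |
| $\frac{\Gamma \vdash \forall V \cdot P}{\Gamma \vdash [V := E]P}$ | $g \vdash \Uparrow_{\forall}(i \cdot p) \to g \vdash (i := e)p$ |
| $\vdash E \in \{V \mid V \in S \land P\} \Leftrightarrow E \in S \land [V := E]P$ | $g \vdash e_1 \dot{\in} \Uparrow_{\{\}}(i : e_2 \cdot p) \dot{\Leftrightarrow} e_1 \dot{\in} e_2 \dot{\land} (i := e_1)p$ |
| $\frac{\Gamma \vdash E = F \quad \Gamma \vdash [V := E]P}{\Gamma \vdash [V := F]P}$ | $g \vdash e_1 \dot{=} e_2 \to g \vdash (i := e_1)p \to g \vdash (i := e_2)p$ |
| $\frac{V \backslash S}{\vdash \exists V \cdot (V \in S) \Rightarrow \downarrow S \in S}$ | $i \backslash e \to g \vdash \Uparrow_{\exists}(i \cdot i \dot{\in} e) \dot{\Rightarrow} \downarrow e \dot{\in} e$ |
| $\frac{V \backslash S, T}{\vdash S \in \uparrow T \Leftrightarrow \forall V \cdot (V \in S \Rightarrow V \in T)}$ | $i \backslash e_1 \to i \backslash e_2 \to g \vdash e_1 \dot{\in} \uparrow e_2 \dot{\Leftrightarrow} \Uparrow_{\forall}(i \cdot i \dot{\in} e_1 \dot{\Rightarrow} i \dot{\in} e_2)$ |
| $\frac{V \backslash S, T}{\vdash \forall V \cdot (V \in S \Rightarrow V \in T) \land \forall V \cdot (V \in T \Rightarrow V \in S) \Rightarrow S = T}$ | $g \vdash e_1 \dot{\in} \uparrow e_2 \to g \vdash e_2 \dot{\in} \uparrow e_1 \to g \vdash e_1 \dot{=} e_2$ |
| $\vdash \mathrm{infinite}(\mathrm{BIG})$ | $g \vdash \omega_j \dot{\in} \Omega$ and $j_1 \neq j_2 \to g \vdash \dot{\lnot}(\omega_{j_1} \dot{=} \omega_{j_2})$ |
| $\vdash (E \mapsto F) \in (S \times T) \Leftrightarrow (E \in S) \land (F \in T)$ | $g \vdash e_1 \mapsto e_2 \dot{=} e_3 \mapsto e_4 \to g \vdash e_1 \dot{=} e_3$; $\quad g \vdash e_1 \mapsto e_2 \dot{=} e_3 \mapsto e_4 \to g \vdash e_2 \dot{=} e_4$; $\quad i_1 \backslash e \dot{\in} (e_1 \times e_2) \to i_2 \backslash e \dot{\in} (e_1 \times e_2) \to i_1 \neq i_2 \to g \vdash \Uparrow_{\exists}(i_1 \cdot i_1 \dot{\in} e_1 \dot{\land} \Uparrow_{\exists}(i_2 \cdot i_2 \dot{\in} e_2 \dot{\land} e \dot{=} i_1 \mapsto i_2)) \dot{\Leftrightarrow} e \dot{\in} (e_1 \times e_2)$ |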



5 Proofs in BiCoq 
5.1 Standard B Proofs 

Using the definition of $\vdash$, we formally prove in BiCoq all propositional calculus
and predicate calculus results of [1], using the functional representation and
following the proposed proof structure, e.g.:

$i_1 \backslash g \to i_1 \backslash p \to g \vdash (i_2 := i_1)p \to g \vdash \Uparrow_{\forall}(i_2 \cdot p)$, that is $\frac{\Gamma \vdash [V_2 := V_1]P \quad V_1 \backslash \Gamma, P}{\Gamma \vdash \forall V_2 \cdot P}$

To assist the proof construction BiCoq provides Coq tactics written in the Coq
tactic language [19]. For example, the propositional calculus procedure described
in [20], proposing a strategy based on propositional calculus theorems, is provided
as a Coq tactic. More technical Coq tactics are also available in BiCoq, e.g. to
obtain proved fresh variables.

An alternative form of theorems is also derived, using the internal De Bruijn
representation; e.g. the $\forall$-introduction rule (to be compared with $[\forall_i]$) is:

$i \backslash g \to i \backslash \dot{\forall} p \to g \vdash \mathrm{Inst}\ i\ 1\ p \to g \vdash \dot{\forall} p$

These last results are of course rather technical, not benefiting from the
functional representation. Yet they have some interest, for technical lemmas or as
derived rules in which only semantical side conditions remain (computations over
De Bruijn indexes dealing with the syntactical ones).



5.2 Mixing BiCoq and Coq Logics 

As is standard in such a deep embedding (e.g. see [9]), BiCoq also provides
results expressing relations between host and guest logics:

$(g \vdash p \lor g \vdash q) \to g \vdash p \dot{\lor} q \qquad g \vdash p \dot{\Rightarrow} q \to (g \vdash p \to g \vdash q)$

$(g \vdash p \land g \vdash q) \leftrightarrow g \vdash p \dot{\land} q \qquad (\forall (y : \mathbb{I}),\ g \vdash (x := y)p) \to g \vdash \Uparrow_{\forall}(x \cdot p)$

Asymmetrical results mark the differences between the classical B logic and the
constructive Coq logic - e.g. a reciprocal of the first rule, combined with the
excluded middle, would prove that for any predicate $p$ either $\vdash p$ or $\vdash \dot{\lnot} p$, which
of course is not the case. This emphasises the fact that both logics are well
separated, the B logic being embedded as an external theory.

By providing the best of both worlds, these results constitute efficient proof
tactics. For example, the last theorem does not reflect non-freeness side
conditions from B to the Coq logic (Coq taking care of such conditions automatically).



6 Developing a Proved B Toolkit 

In this section, we detail how BiCoq is used as a framework for the development 
of formally checked B toolkits. Coq offers mechanisms to extract programs from 
constructive proofs (i.e. software from logical definitions and theorems), but a 
different approach is chosen here. Indeed, BiCoq includes code (in the form of 
functions using the ML-like internal language of Coq) which is proved correct. 
This code is extractible by a pure syntactical process, e.g. in Objective Caml, 
using the extraction mechanism of Coq. By doing so, we obtain proved B tools 
whose code is small, readable and efficient - and independent of Coq. 

Notation. $\mathbb{B}$ represents the booleans, $\top$ being true and $\bot$ being false.

Notation. Hat notations are used for boolean functions (e.g. $\hat{\land}$ is the boolean and).

6.1 Implementing Decidable Properties 

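For $P$ and $f$ respectively a predicate and a boolean function over a type $S$, we
note $(P \leadsto f)$ when $f$ decides $P$, i.e. when the following property is proved:

$\forall (s : S),\ (f(s) = \top \to P(s)) \land (f(s) = \bot \to \lnot P(s))$

By defining folding as the extension of predicates and functions to lists, we prove
that if $f$ decides $P$, then the folding of $f$ decides the folding of $P$:

$\mathrm{Fold}_P(P) := \mathtt{fun}\ (L : [S]) \Rightarrow \forall (s : S),\ s \in L \to P(s)$

$\mathrm{Fold}_f(f) := \mathtt{fun}\ (L : [S]) \Rightarrow \mathtt{if}\ \mathrm{empty}(L)\ \mathtt{then}\ \top\ \mathtt{else}\ f(\mathrm{head}(L)) \mathbin{\hat{\land}} \mathrm{Fold}_f(f)(\mathrm{tail}(L))$

$(P \leadsto f) \to (\mathrm{Fold}_P(P) \leadsto \mathrm{Fold}_f(f))$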
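
The folding construction can be mimicked outside Coq to get a feel for it. In the Python sketch below, a "predicate" is simulated by a slow reference function and its decider by a fast boolean function; `fold_f`, `fold_p`, `is_even_spec` and `is_even` are hypothetical names chosen for the illustration, and agreement is only observed on samples, whereas BiCoq proves it once and for all.

```python
def fold_f(f):
    """Extend a boolean function on elements to lists, as Fold_f does."""
    def folded(items):
        if not items:                    # empty list: true
            return True
        return f(items[0]) and folded(items[1:])
    return folded


def fold_p(p):
    """Extend a property on elements to lists: it must hold for every member."""
    return lambda items: all(p(s) for s in items)


def is_even_spec(n):
    """Reference property: n is twice some natural number (searched naively)."""
    return any(n == 2 * k for k in range(n + 1))


def is_even(n):
    """Candidate decider for is_even_spec."""
    return n % 2 == 0
```

If `is_even` agrees with `is_even_spec` on every element, then `fold_f(is_even)` agrees with `fold_p(is_even_spec)` on every list, which is the shape of the lemma stated above.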

Example 7 (Non-freeness). Non-freeness is defined in B as a logical proposition and
represented by the inductive type $\backslash$ in BiCoq. Our implementation strategy consists
in developing a program $\hat{\backslash} : \mathbb{I} \to \mathbb{T} \to \mathbb{B}$ and to prove that $(\backslash \leadsto \hat{\backslash})$. Hence $\hat{\backslash}$ and
its extension (checking that a variable does not occur free in a list of hypotheses) are
proved correct and can be extracted.

In BiCoq this approach is systematic; all typed equalities are implemented
and proved correct (e.g. term equality), as well as non-freeness, list membership,
inclusion, etc. to constitute our formally checked B toolkit.

6.2 A Proved Prover for the B Logic 

In this paragraph we focus on the definition of an extractible prover to conduct
first-order B proofs for standard B developments.

BiCoq includes programs, named B tactics in the following, to simulate the
application of B inference rules or theorems. By providing such a dedicated
piece of code for each of the inference rules listed in Tab. 1 and by proving
them correct, we obtain a correct and complete prover (that is, any standard B
result can be derived using this prover).

To this end, a type for sequents is defined as the product $[\mathbb{P}] \times \mathbb{P}$; for $g : [\mathbb{P}]$ and
$p : \mathbb{P}$ we denote $g \Vdash p$ the associated pair. While $g \vdash p$ is the type of B proofs of
$p$ under the assumptions $g$, that can be inhabited or not, $g \Vdash p$ is a syntactical
construct extending $\mathbb{T}$. To interpret a sequent, we use the translation $\mathrm{Trans}_{\vdash}$
that for a pair $g \Vdash p$ returns the type $g \vdash p$ (and its extension derived by $\mathrm{Fold}_P$).

A B tactic is a function $T_B : \Vdash \to [\Vdash]$ that, provided a goal $g \Vdash p$, returns
a list of subgoals $[g_1 \Vdash p_1, \ldots, g_n \Vdash p_n]$ which together are sufficient to prove
$g \Vdash p$; if a B tactic concludes (proves the goal) this list is empty. The following
(elementary) examples give the definition of the B tactics associated respectively
to the inference rules $[\in]$ and $[\land_i]$:

Example 8. $T_{\in}(s) := \mathtt{let}\ (g \Vdash p := s)\ \mathtt{in}\ \mathtt{if}\ p \mathbin{\hat{\in}} g\ \mathtt{then}\ []\ \mathtt{else}\ [s]$

Example 9. $T_{\land_i}(s) := \mathtt{let}\ (g \Vdash p := s)\ \mathtt{in}\ \mathtt{match}\ p\ \mathtt{with}\ p_1 \dot{\land} p_2 \Rightarrow [g \Vdash p_1, g \Vdash p_2] \mid \_ \Rightarrow [s]$

The implementation strategy described in Par. 6.1 is now particularly relevant,
as $T_{\in}$ uses the boolean function $\hat{\in}$ instead of the logical proposition $\in$.
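
The two tactics of Examples 8 and 9 translate almost literally into a functional program. The Python sketch below assumes a sequent is a pair `(hypotheses, goal)` with predicates encoded as tuples such as `("and", p1, p2)`; the names `tac_mem`, `tac_and_intro` and the small driver `run` are hypothetical, and `run` is an extra convenience that is not described in the text.

```python
def tac_mem(s):
    """Rule [∈]: the goal is closed when it appears among the hypotheses."""
    g, p = s
    return [] if p in g else [s]


def tac_and_intro(s):
    """Rule [∧i]: a conjunction is split into two subgoals."""
    g, p = s
    if p[0] == "and":
        return [(g, p[1]), (g, p[2])]
    return [s]                           # tactic not applicable: goal unchanged


def run(tactics, goals, fuel=100):
    """Apply the first tactic that makes progress on each goal, until stable."""
    for _ in range(fuel):
        new_goals = []
        for s in goals:
            for tac in tactics:
                out = tac(s)
                if out != [s]:
                    new_goals.extend(out)
                    break
            else:                        # no tactic made progress on s
                new_goals.append(s)
        if new_goals == goals:
            break
        goals = new_goals
    return goals
```

An empty result means every subgoal was closed; remaining sequents are the goals the chosen tactics could not discharge. Soundness of such a prover rests entirely on each tactic being proved correct, which is what BiCoq does in Coq and what this sketch does not attempt.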

Following the same principles, numerous (much more complex) B tactics are
provided in BiCoq, implementing theorems or strategies, such as the decision
procedure for propositional calculus described in [20]. For each B tactic $T_B$, the
correctness is ensured by a proof of the following property:

$\forall (s : \Vdash),\ \mathrm{Trans}_{\vdash}(T_B(s)) \to \mathrm{Trans}_{\vdash}(s)$, that is $\frac{g_1 \vdash p_1 \quad \ldots \quad g_n \vdash p_n}{g \vdash p}$

Thanks to the functions defined in Par. 4.2, management of the De Bruijn
indexes can be hidden from the users of the B tactics. With the programs already
provided in BiCoq (such as non-freeness, binding, etc.), these B tactics constitute
the core of a proved prover. This prover still lacks automation and HMI, and
should be coupled with other tools, for example a B parser using the platform
BRILLANT [§Q|.



7 Higher-Order Considerations and Extensions 

While the B logic is first-order, various definitions and proofs in [1] are
conducted in a higher-order meta-logic: results in propositional calculus are proved
by induction over terms, and refinement is defined by quantification over
predicates before being transformed into an equivalent first-order definition. Using
the higher-order framework provided by Coq, BiCoq can clearly be extended to
integrate and to formally check such concepts.

New results can also be derived; for example, using the proof depth function
$\mathcal{D}_{\vdash} : \vdash \to \mathbb{N}$ we obtain a depth induction principle on B proof trees, e.g. for results
about proof rewriting. Other results, proved in higher-order logic, are applicable
in first-order B logic, and implemented as B tactics for standard B proofs. This
is the case for the following congruence results.



Predicate Substitution. We extend the B logic syntax with a new predicate 
variable constructor ttK : P (K being an infinite set of names with a decidable 
equality), without adding any inference rules in order not to enrich the BiC'oq 
logi<H. Only limited modifications of BiCoq are required to deal with this new 
constructor, e.g. non-freeness with the additional rule V (i:T)(k:K), z\7Tfc. 

Predicate variables play a role similar to the one of the variables - they are 
placeholders that can be replaced by a predicate using the substitution function 
(k := pi)p2 '■ K — > P — > P — > P, not detailed in this paper, that mimicks the 
expression substitution function (see Par. l4.2[ ). Thanks to this extension, we can 
prove the following congruence rules for <£> and implement associated B tactics 
that can be used e.g. to unfold a definition in a term, even under binders: 
gh pi&p 2 ->g\ L (j: = p 1 )p-^{j: = p 2 )p g h pi4>p 2 ^ g h (j :=pi)e=(j :=p 2 )e 

Example 10. x=0, y∈ℕ ⊢ y≤x ⇔ y=0, therefore we immediately derive (in one step) 
x=0, y∈ℕ ⊢ ∀(v · v∈{}(t:ℕ · t<y ∧ y≤x)) ⇔ ∀(v · v∈{}(t:ℕ · t<y ∧ y=0)) 

Note that predicate substitution and expression substitution mechanically forbid 
the capture of variables in the substituted subterm, by lifting dangling De Bruijn 
indexes when crossing a binder. That is, in Ex. 10, if v or t appear free in the 
substituted subterm, they escape capture during substitution. 
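
The lifting mechanism can be illustrated on a toy term language. The following is only a minimal sketch, assuming a hypothetical encoding of terms as nested Python tuples; the tags, the function names `lift` and `subst`, and the reduced set of constructors are illustrative assumptions and do not reproduce the BiCoq definitions.

```python
# Terms as nested tuples (a hypothetical encoding, not BiCoq's Coq types):
#   ("var", i)     expression variable, i is a De Bruijn index
#   ("pvar", k)    predicate variable named k
#   ("eq", a, b), ("and", p, q), ("not", p)
#   ("forall", p)  binder; index 0 in p is the bound variable

def lift(t, cutoff=0):
    """Add 1 to every dangling index (>= cutoff) of t."""
    tag = t[0]
    if tag == "var":
        return ("var", t[1] + 1) if t[1] >= cutoff else t
    if tag == "pvar":
        return t
    if tag == "forall":
        # Under a binder, one more index is bound, so the cutoff grows.
        return ("forall", lift(t[1], cutoff + 1))
    return (tag,) + tuple(lift(s, cutoff) for s in t[1:])

def subst(k, p, t):
    """Replace predicate variable k by p in t, lifting p under each binder."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "pvar":
        return p if t[1] == k else t
    if tag == "forall":
        # Lifting p keeps its dangling indexes pointing outside the binder.
        return ("forall", subst(k, lift(p), t[1]))
    return (tag,) + tuple(subst(k, p, s) for s in t[1:])

# p mentions two free variables, with dangling indexes 0 and 1.
p = ("eq", ("var", 0), ("var", 1))
# Context: a binder whose body uses both the predicate variable k
# and the bound variable (index 0).
context = ("forall", ("and", ("pvar", "k"), ("eq", ("var", 0), ("var", 0))))
result = subst("k", p, context)
```

In `result`, the indexes of `p` have become 1 and 2, so they still denote the same free variables, while the index 0 of the context remains the variable bound by the quantifier: nothing is captured.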

Predicate Grafting. Other congruence results can be derived for grafting of 
predicates, a modified substitution (not lifting the substituted subterm) allowing 
for the capture of variables: 

⟨k◁p⟩t : K → P → T → T := match t with 
  | Ω | χᵢ ⇒ t 
  | πₖ′ ⇒ if k′ = k then p else t 
  | ∀p′ ⇒ ∀⟨k◁p⟩p′ 
  | {e′|p′} ⇒ {⟨k◁p⟩e′|⟨k◁p⟩p′} 
  | . . . (straightforward extension) 

The associated congruence results and proofs are technical, and not detailed in 
this paper. We just provide for illustration a simplified version of these results: 

⊢ p₁⇔p₂ → g ⊢ ⟨j◁p₁⟩p ⇔ ⟨j◁p₂⟩p        ⊢ p₁⇔p₂ → g ⊢ ⟨j◁p₁⟩e = ⟨j◁p₂⟩e 

Example 11. g ⊢ ⟨k◁¬¬p⟩q ⇔ ⟨k◁p⟩q, that is, the elimination of double negations in a 
subterm (even if dangling De Bruijn indexes of p are bound in q) 
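
To make the capture behaviour of grafting concrete, here is a minimal sketch on a toy term language. The tuple encoding, the tags and the function name `graft` are assumptions made for illustration only; they are not the BiCoq definitions, which are written in Coq and cover the full B syntax.

```python
# Terms as nested tuples (a hypothetical encoding, not BiCoq's Coq types):
#   ("var", i)     expression variable, i is a De Bruijn index
#   ("pvar", k)    predicate variable named k
#   ("eq", a, b), ("not", p), ("compr", e, p)
#   ("forall", p)  binder; index 0 in p is the bound variable

def graft(k, p, t):
    """Replace predicate variable k by p in t WITHOUT lifting p."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "pvar":
        return p if t[1] == k else t
    # Binders get no special treatment: p is passed down unchanged, so its
    # dangling indexes may end up bound by an enclosing binder.
    return (tag,) + tuple(graft(k, p, s) for s in t[1:])

p = ("eq", ("var", 0), ("var", 1))        # index 0 dangles in p
q = ("forall", ("not", ("pvar", "k")))    # k occurs under a binder

grafted = graft("k", p, q)
double_neg = graft("k", ("not", ("not", p)), q)
```

In `grafted`, index 0 of `p` now refers to the variable bound by the quantifier of `q`: it has been captured, which a lifting substitution would have prevented. The terms `double_neg` and `grafted` have the shape of the two sides of the equivalence in Ex. 11.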

Remark. Results such as the ones in Exs. 10 or 11 are provable in B, on a case- 
by-case basis, with a first-order proof depending on the structure of the term 
in which substitution or grafting is done. It is therefore conceivable to develop 
a specific (and likely complex) B tactic automatically building for such goals 
a proof using the B inference rules. On the contrary, the proposed extensions 
provide a new approach through results derived from a higher-order proof; the 
associated B tactics are therefore simpler, and produce generic (and shorter) 
proofs by using not only the B inference rules but also induction on T. 

8 However, some new (propositional) sequents became provable, such as πₖ ⊢ πₖ. 



8 Conclusion 



Through an accurate deep embedding of the B logic in Coq, we identify shortfalls 
or confusions in [1] and propose amendments in order to be able to validate stan- 
dard results - improving the confidence in the method and in the developments 
conducted with it. We describe a strategy to further benefit from this deep em- 
bedding by implementing verified B tools, extractible to be used independently 
of Coq. The approach is illustrated by the development of B tactics that consti- 
tute a complete and correct prover - usable to conduct proofs (provided further 
automation), or to check proofs produced by other tools. The objective, again, 
is to have better confidence in the developments conducted in B. 

We also explain how, benefiting from the higher-order features of Coq, new 
results for B can be derived, and present an extension to derive congruence 
theorems related to equivalence, implemented in our prover. 

All the results presented in this paper are mechanically checked; BiCoq cur- 
rently represents about 550 definitions (i.e. types, properties, functions), 750 
theorems and proofs in Coq - and about 6 man-months of development. It now 
has to be extended with the following definitions and results: 

— Generation by the prover of B proof terms checkable by Coq. 

— Use of a locally nameless De Bruijn representation with named free variables 
to derive unified congruence results (merging substitution and grafting). 

— Fixpoint constructs, with application to the definition of natural numbers in 
the B style; on the innovative side, we expect to derive inductive B tactics, 
not available in current B implementations. 

— GSL definition - either through a shallow embedding (an approach similar to 
the one presented in [6], but in BiCoq) or through a deep embedding (with 
higher-order and first-order refinement definitions, and proof of equivalence). 

We would like to emphasise the simplicity and the efficiency of the deep em- 
bedding approach, when having both validation and implementation objectives. 
In a relatively short amount of time, it was possible to describe the B logic, to 
check its standard results, and to implement a proved prover for this logic. 